Skip to content

[WIP][Cling] Refactor CIFactory and add incremental action support#21903

Draft
SahilPatidar wants to merge 4 commits into
root-project:masterfrom
SahilPatidar:incr_action
Draft

[WIP][Cling] Refactor CIFactory and add incremental action support#21903
SahilPatidar wants to merge 4 commits into
root-project:masterfrom
SahilPatidar:incr_action

Conversation

@SahilPatidar

Copy link
Copy Markdown

This Pull request:

This PR refactors CIFactory and adds support for IncrementalAction (from Clang-Repl). The goal is to reduce the amount of Clang-specific logic handled inside CIFactory and rely more on Clang’s existing infrastructure.

Changes or fixes:

Checklist:

  • tested changes locally
  • updated the docs (if necessary)

This PR fixes #

@devajithvs devajithvs self-assigned this Apr 13, 2026
@github-actions

github-actions Bot commented Apr 13, 2026

Copy link
Copy Markdown

Test Results

    21 files      21 suites   3d 8h 21m 17s ⏱️
 3 861 tests  3 861 ✅ 0 💤 0 ❌
72 443 runs  72 443 ✅ 0 💤 0 ❌

Results for commit 2e181c5.

♻️ This comment has been updated with latest results.

@SahilPatidar

Copy link
Copy Markdown
Author

@vgvassilev
Most of the failing tests are related to Clad:

TEST FAILURES:
1:cppinterop-CppInterOpTests
185:gtest-hist-hist-TFormulaGradientTests
186:gtest-hist-hist-TFormulaHessianTests
271:gtest-math-mathcore-CladDerivatorTests
309:minuit2_testADMinim
330:gtest-roofit-histfactory-testHistFactory
350:gtest-roofit-roofitcore-testRooFuncWrapper
353:gtest-roofit-roofitcore-testRooMinimizer
361:test-stressroofit-codegen
426:tmva-sofie-test-TestCladAutodiff
430:gtest-tmva-sofie-TestGemmDerivative
922:tutorial-math-fit-exampleFit3D
945:tutorial-math-fit-minuit2GausFit

Some of them show errors like:

error: static assertion failed: clad doesn't appear to be loaded; make sure that you pass clad.so to clang.
        static_assert(false, "clad doesn't appear to be loaded; make sure that "
                      ^~~~~

This may be happening because I made rough changes in FrontendAction::CreateWrappedASTConsumer, where it now directly returns the DeclCollector from IncrementalAction::CreateASTConsumer without adding the plugin consumer.

Previously, I mentioned an issue related to the Clad plugin consumer. There was a problem with CladPlugin::initializeSema adding a SemaExternalSource, which I have resolved.

If we allow the plugin consumer, then during BeginSourceFile, when setASTConsumer is called, it triggers Initialize. This ends up stealing and replacing the CompilerInstance’s consumers. I’m not fully sure why this happens, but it appears to be causing many of the test failures.

In the original Cling implementation, as far as I can see, we never call Plugin::Initialize (so no replacement happens). Instead, we take the plugin consumer and pass it to the DeclCollector consumer.

We need to find a way (to pass plugin into DeclCollector or something else) to solve this first before moving forward.

@vgvassilev

Copy link
Copy Markdown
Member

When clad is OFF we should not run these tests. This means that we need to suppress them in cmake. cc: @guitargeek

Are the related llvm/clang changes really required? I think we should do extra effort to avoid them..

@SahilPatidar

Copy link
Copy Markdown
Author

Those are in CodeGenAction.cpp. On my local system, they were causing a segmentation fault. I’m not sure why, but after commenting them out, the crash stopped:

// PrettyStackTraceDecl CrashInfo(*D.begin(), SourceLocation(),
//                                Context->getSourceManager(),
//                                "LLVM IR generation of declaration");

And another change related to CompilerInstance and SourceManager initialization. As I mentioned, in Cling we use one large virtual source file to keep source locations consistent.

For now, I added a temporary helper:

runIncrementalSourceMgrInitializer

@vgvassilev

Copy link
Copy Markdown
Member

Those are in CodeGenAction.cpp. On my local system, they were causing a segmentation fault. I’m not sure why, but after commenting them out, the crash stopped:

// PrettyStackTraceDecl CrashInfo(*D.begin(), SourceLocation(),
//                                Context->getSourceManager(),
//                                "LLVM IR generation of declaration");

And another change related to CompilerInstance and SourceManager initialization. As I mentioned, in Cling we use one large virtual source file to keep source locations consistent.

For now, I added a temporary helper:

runIncrementalSourceMgrInitializer

Sounds suspicious. Maybe we should run valgrind.

@SahilPatidar

Copy link
Copy Markdown
Author

We need to address these issues in the current setup:

  • Based on the original Cling setup, CladPlugin should be passed to DeclCollector, not to CompilerInstance.
  • initialization of the SourceManager to avoid clang side work.

For SourceManager related:
Currently, we use one large virtual file as the main file to generate unique source locations. We create this using custom logic during SourceManager initialization. However, clang::FrontendAction::BeginSourceFile does not provide a way to do this without modifying its behavior.

So, I tried overriding the MainFileID inside clang::FrontendAction::BeginSourceFileAction. I moved the source initialization code there. It builds fine, and I did not notice any other test failures compare to previous version.

I also tried another approach (just to test how it behaves). I let BeginSourceFile initialize the SourceManager as usual, and then I created a new virtual file. This file uses the end location of the main file when creating its FileID, so it acts like an includer instead of the main file.

const char* Filename = "<<< includer >>>";
FileEntryRef FE = FM.getVirtualFileRef(Filename, 1U << 15U, time(0));

SM.setFileIsTransient(FE);

SourceLocation Result = SM.getLocForEndOfFile(SM.getMainFileID());

m_VirtualFileID =
    SM.createFileID(FE, Result, SrcMgr::C_User);

auto Buffer = llvm::MemoryBuffer::getMemBufferCopy("/*CLING DEFAULT MEMBUF*/;\n");

SM.overrideFileContents(FE, std::move(Buffer));

After this, I did not see any issue while building. But when I ran roottest-root-tree, a few tests started failing. One of them failed with this error:

3484: /Users/sahilpatidar/Desktop/root/roottest/root/treeformula/stl/runpair.C:4:11: error: redefinition of 'file' as different kind of symbol
3484:    TFile *file = TFile::Open("RefTest.root");
3484:           ^
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/sys/kernel_types.h:54:8: note: previous definition is here
3484: struct file;

This error happens inside CheckVariableDeclaration:

bool DeclExtractor::CheckForClashingNames(
    const llvm::SmallVector<NamedDecl*, 4>& Decls,
    DeclContext* DC) {

  for (size_t i = 0; i < Decls.size(); ++i) {
    ...
    else if (VarDecl* VD = dyn_cast<VarDecl>(ND)) {
      LookupResult Previous(*m_Sema, ND->getDeclName(), ND->getLocation(),
                            Sema::LookupOrdinaryName,
                            RedeclarationKind::ForVisibleRedeclaration);
      m_Sema->LookupQualifiedName(Previous, DC);
      m_Sema->CheckVariableDeclaration(VD, Previous);
      ...

I am not sure why this is happening, and I am also not sure any of these way is correct? Can we do something better?

@vgvassilev

Copy link
Copy Markdown
Member

Can you debug side by side with the working version?

@SahilPatidar

Copy link
Copy Markdown
Author

I found the issue.

The second approach (using a different virtual file instead of the main file) was failing because of a check in DefinitionShadower::Transform. This code decides whether to put declarations inside the __cling_N5 namespace to avoid name conflicts.

It uses typedInClingPrompt for this check. If it returns true, the declaration is wrapped inside the namespace.

In my case, since I used another virtual file instead of the main file, this function returned false. Because of that, the declaration was not added to the namespace, which caused the name collision error.

static bool typedInClingPrompt(FullSourceLoc L) {
  if (L.isInvalid())
    return false;

  const SourceManager &SM = L.getManager();
  const FileID FID = SM.getFileID(L);

  return SM.isFileOverridden(SM.getFileEntryForID(FID)) &&
         (SM.getFileID(SM.getIncludeLoc(FID)) == SM.getMainFileID());
}

@devajithvs

Copy link
Copy Markdown
Contributor

In my case, since I used another virtual file instead of the main file, this function returned false. Because of that, the declaration was not added to the namespace, which caused the name collision error.

Maybe we can add a check to see if the buffer name starts with input_line

Something similar to what we have here:

assert(FE->getName().contains("input_line_"));

@SahilPatidar

Copy link
Copy Markdown
Author

I tried it, and now the tests match the working version on my machine. If second approach is correct to do in Cling, then the only remaining issue is the Clad plugin.

@vgvassilev

Copy link
Copy Markdown
Member

For Clad, here is where we steal the consumers and tune them: https://github.com/vgvassilev/clad/blob/db7f52fafcb45bf8cbaa8b0c7256c664c553e940/tools/ClangPlugin.cpp#L154 and here https://github.com/vgvassilev/clad/blob/db7f52fafcb45bf8cbaa8b0c7256c664c553e940/tools/ClangPlugin.h#L173

I suspect some of these assumptions are incorrect after your patch.

@SahilPatidar

Copy link
Copy Markdown
Author

From the code, I see that in the original setup we never called Initialize for the Clad consumer. I also checked in the debugger and confirmed we never did any “stealing” of the consumer.

In the original version, we just passed the plugin to the DeclCollector consumer here:
https://github.com/vgvassilev/root/blob/7c320ae5313332784afd0727e95f1e9cb99ac44b/interpreter/cling/lib/Interpreter/IncrementalParser.cpp#L339

In the new version, we pass the plugin consumer in CI during FrontendAction::BeginSourceFile, and setASTConsumer calls Initialize, which does the "stealing.” This is the behavior difference I see.

That’s why I directly return the consumer from FrontendAction::CreateWrappedConsumer without adding the Clad consumer for now, which causes Clad-related test failures. If I allow the Clad consumer to be passed in CI, other many tests fail.

@vgvassilev

vgvassilev commented Apr 25, 2026

Copy link
Copy Markdown
Member

From the code, I see that in the original setup we never called Initialize for the Clad consumer. I also checked in the debugger and confirmed we never did any “stealing” of the consumer.

In the original version, we just passed the plugin to the DeclCollector consumer here: https://github.com/vgvassilev/root/blob/7c320ae5313332784afd0727e95f1e9cb99ac44b/interpreter/cling/lib/Interpreter/IncrementalParser.cpp#L339

In the new version, we pass the plugin consumer in CI during FrontendAction::BeginSourceFile, and setASTConsumer calls Initialize, which does the "stealing.” This is the behavior difference I see.

That’s why I directly return the consumer from FrontendAction::CreateWrappedConsumer without adding the Clad consumer for now, which causes Clad-related test failures. If I allow the Clad consumer to be passed in CI, other many tests fail.

Ok, we will need to think how to circumvent the stealing on clad’s side. Can you think of a way?

EDIT: Maybe we steal only when the preprocessor has no CLING macro defined? Or even if we don’t have incremental extensions defined.

@SahilPatidar

Copy link
Copy Markdown
Author

I need to understand the behavior when we make changes on the Clad side. How can I modify the Clad plugin and recompile it on root side?

@vgvassilev

Copy link
Copy Markdown
Member

I need to understand the behavior when we make changes on the Clad side. How can I modify the Clad plugin and recompile it on root side?

Not easy. One is edit the source in the build folder clad-prefix, then go to the build directory clad-prefix/…/build and do “make install”. Then cd to the top-most root_build and call “make Cling”. That should be all you need.

@SahilPatidar

Copy link
Copy Markdown
Author

I made a few changes to check if this works.

First, I commented out Clad::ClangPlugin::Initialize to stop it from taking the ASTConsumer from CI.

Second, I changed this and updated Clad::ClangPlugin::Initialize (even though it is commented out). This function expects clad;plugin at the end of the multiplex, but I moved it to the beginning:

PluginASTAction::ActionType getActionType() override {
  // return AddAfterMainAction;
  return AddBeforeMainAction;
}

In the original setup, we add clad;plugin before the codegen consumer, so I made this change to keep that order.

Third, I updated ParseOrWrapTopLevelDecl to call cladplugin, since it is in CI and not in DeclCollector:

// if (ADecl && !m_Consumer->HandleTopLevelDecl(ADecl.get())) {
if (ADecl && !S.getASTConsumer().HandleTopLevelDecl(ADecl.get())) {

After these changes, the tests that were failing on Ubuntu build are now passing.

I want to understand how to avoid initialization (so it does not take the consumer), and when we should do that.

I don’t see any other test failures locally after these changes:

roottest-root-tree
roottest-root-hist
roottest-root-math
io-stdarray
roottest-cling

@SahilPatidar

Copy link
Copy Markdown
Author

Clad-side changes - SahilPatidar/clad@f6ad58d.

@vgvassilev

Copy link
Copy Markdown
Member

Clad-side changes - SahilPatidar/clad@f6ad58d.

Does the clad test suite pass?

@SahilPatidar

Copy link
Copy Markdown
Author

I didn’t check on the Clad side. I’ll open a draft PR to see whether it passes Clad-CI.

@SahilPatidar

Copy link
Copy Markdown
Author

Currently, we are failing with this kind of error:

In file included from <<< cling interactive line includer >>>:2:
/Users/sftnight/ROOT-CI/build/include/ROOT/RFloat16.hxx:11:10: fatal error: module 'std' is needed but has not been provided, and implicit use of module files is disabled
#include <cstdint>
         ^
Error: Parsing #pragma failed /Users/sftnight/ROOT-CI/src/core/base/inc/LinkDef.h

The build works locally, so it looks like some flag is changing the build behavior in CI, and our changes are triggering it.

While building locally, I noticed this flag might be causing the different behavior after our changes:
-DCMAKE_CXX_STANDARD=23
Any idea what might be causing this?

@devajithvs devajithvs added the clean build Ask CI to do non-incremental build on PR label May 22, 2026
@devajithvs

Copy link
Copy Markdown
Contributor

Currently, we are failing with this kind of error:

In file included from <<< cling interactive line includer >>>:2:
/Users/sftnight/ROOT-CI/build/include/ROOT/RFloat16.hxx:11:10: fatal error: module 'std' is needed but has not been provided, and implicit use of module files is disabled
#include <cstdint>
         ^
Error: Parsing #pragma failed /Users/sftnight/ROOT-CI/src/core/base/inc/LinkDef.h

The build works locally, so it looks like some flag is changing the build behavior in CI, and our changes are triggering it.

That's because the CI was compiling incrementally. I've added a flag to clean build. Can you rebase and push again to trigger a rebuild?

@SahilPatidar

Copy link
Copy Markdown
Author
cmake -Droottest=On \
      -G Ninja \
      -Dtesting=On \
      -DCMAKE_BUILD_TYPE=Debug \
      -DLLVM_ENABLE_ASSERTIONS=On \
      ../

I mean the build succeeds locally with the command above, but I get the same error if I add these flags:

cmake -Droottest=On \
      -G Ninja \
      -Dtesting=On \
      -DCMAKE_BUILD_TYPE=Release \
      -DLLVM_ENABLE_ASSERTIONS=On \
      -Dlibcxx=OFF -Dmathmore=ON -DCMAKE_CXX_STANDARD=23 ../

Currently, I am trying to build with the following configuration. If this works, then it seems that the CMAKE_CXX_STANDARD=23 flag is causing some issue with the new changes:

cmake -S ../ -B . -G Ninja  -DCMAKE_BUILD_TYPE=RelWithDebInfo "-DCMAKE_CXX_STANDARD=17" "-DLLVM_ENABLE_ASSERTIONS=ON" "-Dall=OFF" "-Darrow=OFF" "-Dasan=OFF" "-Dasimage=ON" "-Dasserts=ON" "-Dbuiltin_cfitsio=ON" "-Dbuiltin_clang=ON" "-Dbuiltin_cling=ON" "-Dbuiltin_fftw3=ON" "-Dbuiltin_freetype=ON" "-Dbuiltin_ftgl=ON" "-Dbuiltin_gl2ps=ON" "-Dbuiltin_gsl=ON" "-Dbuiltin_gtest=ON" "-Dbuiltin_llvm=ON" "-Dbuiltin_lz4=ON" "-Dbuiltin_lzma=ON" "-Dbuiltin_nlohmannjson=ON" "-Dbuiltin_openssl=ON" "-Dbuiltin_openui5=ON" "-Dbuiltin_pcre=ON" "-Dbuiltin_tbb=ON" "-Dbuiltin_tiff=ON" "-Dbuiltin_unuran=ON" "-Dbuiltin_vdt=ON" "-Dbuiltin_xrootd=ON" "-Dbuiltin_xxhash=ON" "-Dbuiltin_zlib=OFF" "-Dbuiltin_zstd=ON" "-Dccache=ON" "-Dcefweb=OFF" "-Dcheck_connection=ON" "-Dclad=ON" "-Dclingtest=OFF" "-Dcocoa=ON" "-Dcoverage=OFF" "-Dcuda=OFF" "-Dcurl=ON" "-Ddaos=OFF" "-Ddataframe=ON" "-Ddavix=OFF" "-Ddcache=OFF" "-Ddev=OFF" "-Ddistcc=OFF" "-Dfail-on-missing=OFF" "-Dfcgi=OFF" "-Dfftw3=ON" "-Dfitsio=ON" "-Dgdml=ON" "-Dgminimal=OFF" "-Dgnuinstall=OFF" "-Dgsl_shared=OFF" "-Dgviz=OFF" "-Dhttp=ON" "-Dimt=ON" "-Dlibcxx=OFF" "-Dmacos_native=OFF" "-Dmathmore=ON" "-Dmemory_termination=OFF" "-Dminimal=OFF" "-Dminuit2_omp=OFF" "-Dmpi=OFF" "-Dopengl=ON" "-Dpyroot=ON" "-Dpythia8=OFF" "-Dqt6web=OFF" "-Dr=OFF" "-Droofit=ON" "-Droofit_multiprocess=OFF" "-Droot7=ON" "-Drootbench=OFF" "-Droottest=ON" "-Druntime_cxxmodules=ON" "-Dshadowpw=OFF" "-Dshared=ON" "-Dsoversion=OFF" "-Dspectrum=ON" "-Dsqlite=ON" "-Dssl=ON" "-Dtest_distrdf_dask=OFF" "-Dtest_distrdf_pyspark=OFF" "-Dtesting=ON" "-Dtmva-cpu=ON" "-Dtmva-cudnn=OFF" "-Dtmva-gpu=OFF" "-Dtmva-pymva=OFF" "-Dtmva-rmva=OFF" "-Dtmva-sofie=OFF" "-Dtmva=ON" "-Dunfold=ON" "-Dunuran=ON" "-During=OFF" "-Dvdt=ON" "-Dvecgeom=OFF" "-Dwebgui=ON" "-Dwin_broken_tests=OFF" "-Dwinrtdebug=OFF" "-Dx11=OFF" "-Dxml=ON" "-Dxrootd=ON" -DROOT_COMPILEDATA_IGNORE_BUILD_NODE_CHANGES=ON

@SahilPatidar

Copy link
Copy Markdown
Author

I fixed the issue. It now builds successfully locally with the same configuration. Next, I’ll run ctest.

After that, I want to know how to test the Clad-related changes on CI. Can I modify the CMake file in the PR to point to my Clad branch? I mean, just changing this in the PR:

list(APPEND _clad_extra_settings GIT_REPOSITORY https://github.com/vgvassilev/clad.git)
list(APPEND _clad_extra_settings GIT_TAG v2.2)

@vgvassilev

Copy link
Copy Markdown
Member

I fixed the issue. It now builds successfully locally with the same configuration. Next, I’ll run ctest.

After that, I want to know how to test the Clad-related changes on CI. Can I modify the CMake file in the PR to point to my Clad branch? I mean, just changing this in the PR:

list(APPEND _clad_extra_settings GIT_REPOSITORY https://github.com/vgvassilev/clad.git)
list(APPEND _clad_extra_settings GIT_TAG v2.2)

Yes, exactly.

@SahilPatidar

Copy link
Copy Markdown
Author

I've updated the CMake and fix the std error. Let's see if it fixes the issue.

@vgvassilev

Copy link
Copy Markdown
Member

Can you resolve the conflicts?

@SahilPatidar

Copy link
Copy Markdown
Author

This is coming from recent clad CMake changes. I rebased recently, but later reverted that rebase because while rebuilding ROOT (even clean build) I started getting some build failures from CMake config and some library related issues, like:

CMake Error at src/CMakeLists.txt:120 (ADD_LIBRARY):
  Cannot find source file:
    FTLibrary.h

There was no conflict with my changes. So for now I reverted the rebase to continue build and test the CMake update changes.

I can rebase again if this blocks CI build. This CMakeLists change will also revert once we test this on CI.

@SahilPatidar

Copy link
Copy Markdown
Author

Without rebasing, can we not run the CI build?

@devajithvs

Copy link
Copy Markdown
Contributor

Without rebasing, can we not run the CI build?

No, we can’t. The build script rebases against master before running the build and tests.

@vgvassilev

Copy link
Copy Markdown
Member

Can you rebase again to make sure we get a green build?

@vgvassilev

Copy link
Copy Markdown
Member

Hi @smuzaffar, can we check this PR against cmssw?

@smuzaffar

Copy link
Copy Markdown
Contributor

@vgvassilev , I have started cmssw tests via cms-sw#232

Comment on lines +1299 to +1300
// if (DiagOpts.VerifyDiagnostics)
// Diags->setClient(new clang::VerifyDiagnosticConsumer(*Diags));

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this? I suspect that this code is used for cling standalone to run the tests. However, maybe with the new changes we don't need it?

if (CI->getCodeGenOpts().TimePasses)
CI->createFrontendTimer();
// if (CI->getCodeGenOpts().TimePasses)
// CI->createFrontendTimer();

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise.

std::unique_ptr<ASTConsumer> Consumer,
clang::Preprocessor& PP) {
m_IncrParser = IncrParser;
// m_IncrParser = IncrParser;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise -- why don't we need that anymore?

@@ -0,0 +1,99 @@
//--------------------------------------------------------------------*- C++ -*-
// CLING - the C++ LLVM-based InterpreterG :)
// author: Axel Naumann <axel@cern.ch>

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the wrong attribution -- please add your name.

bool Initialize(llvm::SmallVectorImpl<ParseResultTransaction>& result,
bool isChildInterpreter);
clang::CompilerInstance* getCI() const { return m_CI.get(); }
// clang::CompilerInstance* getCI() const { return m_CI.get(); }

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should drop that commented code.

// // ready before we parse code for the first time. Without C++ modules
// // we can't setup the calls now because the clang PCH currently just
// // overwrites it in the Initialize method and we have no simple way to
// // initialize them earlier. We handle the non-modules case below.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also fix that.

# list(APPEND _clad_extra_settings GIT_REPOSITORY https://github.com/vgvassilev/clad.git)
# list(APPEND _clad_extra_settings GIT_TAG v2.2)
list(APPEND _clad_extra_settings GIT_REPOSITORY https://github.com/SahilPatidar/clad.git)
list(APPEND _clad_extra_settings GIT_TAG clad-plugin)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that change is in master -- so we can route this to vgvassilev/clad.git but the master branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clean build Ask CI to do non-incremental build on PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants